

TEN: Table Explicitization, Neurosymbolically

Mehrotra, Nikita, Kumar, Aayush, Gulwani, Sumit, Radhakrishna, Arjun, Tiwari, Ashish

arXiv.org Artificial Intelligence

We present a neurosymbolic approach, TEN, for extracting tabular data from semi-structured input text. This task is particularly challenging for text input that does not use special delimiters consistently to separate columns and rows. Purely neural approaches perform poorly due to hallucinations and their inability to enforce hard constraints. TEN uses Structural Decomposition prompting - a specialized chain-of-thought prompting approach - on a large language model (LLM) to generate an initial table, and then uses a symbolic checker to evaluate not only the well-formedness of that table but also to detect cases of hallucination or forgetting. The output of the symbolic checker is processed by a critique-LLM to generate guidance for fixing the table, which is presented to the original LLM in a self-debug loop. Our extensive experiments demonstrate that TEN significantly outperforms purely neural baselines across multiple datasets and metrics, achieving higher exact-match accuracy and substantially lower hallucination rates. A 21-participant user study further confirms that TEN's tables are rated as more accurate (mean score: 5.0 vs 4.3; p = 0.021) and are consistently preferred for ease of verification and correction, with participants favoring our method in over 60% of the cases.
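
The self-debug loop the abstract describes can be pictured with a short Python sketch. Everything below is illustrative: llm, critique_llm, the prompt strings, and the two checker rules (consistent row widths; every cell must occur in the source text) are assumptions standing in for the paper's actual components.

```python
# Minimal sketch of the generate / check / critique loop described above.
# llm and critique_llm stand in for calls to a hosted model and are assumed
# to return a parsed table (a list of rows) and a guidance string respectively;
# the prompts and checker rules are illustrative, not the paper's.

def symbolic_check(table, source_text):
    """Return a list of problems: malformed rows, hallucinated cells."""
    problems = []
    widths = {len(row) for row in table}
    if len(widths) > 1:                                # well-formedness: rectangular table
        problems.append(f"inconsistent row widths: {sorted(widths)}")
    for row in table:
        for cell in row:
            if cell and cell not in source_text:       # hallucination: cell not in input
                problems.append(f"cell not found in input: {cell!r}")
    return problems

def explicitize(llm, critique_llm, source_text, max_rounds=3):
    table = llm(f"Decompose the text into rows and columns, then emit a table:\n{source_text}")
    for _ in range(max_rounds):
        problems = symbolic_check(table, source_text)
        if not problems:
            return table                               # the checker is satisfied
        guidance = critique_llm("Suggest fixes for these table errors:\n" + "\n".join(problems))
        table = llm(f"Revise the table. {guidance}\nInput:\n{source_text}")
    return table                                       # best effort after the loop
```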


4 advanced Machine Learning techniques

#artificialintelligence

What are the typical ways to improve a Machine Learning model? Performing feature selection to remove redundant features is one common answer. Yes, approaches like this may work, but there are many others that can improve the score. Some of them are perhaps less well known, and some are not applicable in every case, but applied at the right time they can bring significant improvement. Let's go through some of them.


Labeling with Active Learning - DataScienceCentral.com

#artificialintelligence

We are in the age of data. In recent years, many companies have already started collecting large amounts of data about their business; many others are only starting now. If you are working in one of these companies, you might be wondering what can be done with all that data. What about using it to train a supervised machine learning (ML) algorithm? The ML algorithm could perform the same classification task a human would, just so much faster!
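
The article stops short of showing the loop itself; below is a common uncertainty-sampling sketch of active learning (a generic recipe with scikit-learn as an assumed toolkit, not the article's specific setup): fit a model on the rows labelled so far, then query the unlabelled rows it is least sure about.

```python
# Uncertainty-sampling active learning: label only the rows the model is
# least confident about. Generic sketch; scikit-learn is an assumed choice.
import numpy as np
from sklearn.linear_model import LogisticRegression

def active_learning_round(X_labeled, y_labeled, X_pool, batch_size=10):
    model = LogisticRegression(max_iter=1000).fit(X_labeled, y_labeled)
    proba = model.predict_proba(X_pool)
    uncertainty = 1.0 - proba.max(axis=1)          # least-confident sampling
    query_idx = np.argsort(uncertainty)[-batch_size:]
    return model, query_idx                        # send these rows to a human labeller
```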


Exemplars can Reciprocate Principal Components

Greer, Kieran

arXiv.org Artificial Intelligence

This paper presents a clustering algorithm that is an extension of the Category Trees algorithm. Category Trees is a clustering method that creates tree structures that branch on category type and not feature. The development in this paper is to consider a secondary order of clustering that is not the category to which the data row belongs, but the tree, representing a single classifier, that it is eventually clustered with. Each tree branches to store subsets of other categories, but the rows in those subsets may also be related. This paper therefore examines that second level of clustering between the other category subsets, to determine whether there is any consistency across it. It is argued that Principal Components may be a related and reciprocal type of structure, and a broader question arises about the relation between exemplars and principal components in general. The theory is demonstrated using the Portugal Forest Fires dataset as a case study. The distributed nature of that dataset can artificially create the tree categories, and the output criterion can also be determined in an automatic and arbitrary way, leading to a flexible and dynamic clustering mechanism.
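
One rough way to probe the paper's question, under assumptions of our own (k-means centroids standing in for exemplars, synthetic correlated data) rather than the paper's method, is to project cluster exemplars onto the data's principal axes and see how far they spread along the leading components:

```python
# Illustrative experiment, not the paper's method: do cluster exemplars
# line up with the data's principal axes?
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 4))   # correlated features

centroids = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X).cluster_centers_
components = PCA(n_components=2).fit(X).components_

# Project each centroid onto the two leading principal axes; exemplars of
# well-separated clusters tend to spread out along these directions.
print((centroids - X.mean(axis=0)) @ components.T)
```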


A Pattern-Hierarchy Classifier for Reduced Teaching

Greer, Kieran

arXiv.org Artificial Intelligence

This paper uses a branching classifier mechanism in an unsupervised scenario, to enable it to self-organise data into unknown categories. A teaching phase is then able to help the classifier learn the true category for each input row, using a reduced number of training steps. The pattern ensembles are learned in an unsupervised manner using closest-distance clustering. This is done without knowing what the actual output category is, and leads to each actual category having several clusters associated with it. One measure of success is then that each of these sub-clusters is coherent, meaning that every data row in the cluster belongs to the same category. The total number of clusters is also important, and a teaching phase can then teach the classifier what the correct actual category is. During this phase, any classifier can also learn or infer correct classifications from some other classifier's knowledge, thereby reducing the required number of presentations. As the information is added, cross-referencing between the two structures allows it to be used more widely. With this process, a unique structure can build up that would not be possible by either method separately. The lower level is a nested ensemble of patterns created by self-organisation. The upper level is a hierarchical tree, where each end node represents a single category only, so there is a transition from mixed ensemble masses to specific categories. The structure also has relations to brain-like modelling.
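
A minimal sketch of the unsupervised phase, assuming a Euclidean metric and a fixed distance threshold rather than the paper's exact settings: grow a new cluster whenever a row is too far from every existing centre, then score coherence once the true categories are revealed.

```python
# Closest-distance clustering: a row joins the nearest cluster centre if it is
# within the threshold, otherwise it seeds a new cluster. The threshold and
# Euclidean metric are illustrative choices.
import numpy as np

def closest_distance_cluster(X, threshold=1.0):
    centers, assign = [], []
    for x in X:
        if centers:
            d = [np.linalg.norm(x - c) for c in centers]
            j = int(np.argmin(d))
            if d[j] <= threshold:
                assign.append(j)
                continue
        centers.append(x.copy())                 # too far from everything: new cluster
        assign.append(len(centers) - 1)
    return np.array(assign)

def coherence(assign, y):
    """Fraction of clusters whose rows all share one true category (y is an array)."""
    clusters = [y[assign == j] for j in np.unique(assign)]
    return float(np.mean([len(set(c)) == 1 for c in clusters]))
```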


Avoiding Human Error When Building Artificial Intelligence

#artificialintelligence

Many real-life databases contain missing values. Yet many popular algorithms and statistical models do not accept data rows containing missing values. Some libraries drop these data rows with little warning. Without those data rows, a model is likely to make biased predictions. For example, a majority of the rows in the Lending Club data have never had a negative credit action and therefore contain a missing value.
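
A small pandas/scikit-learn sketch of the alternative to silently dropping rows: impute the missing values and keep the missingness itself as a feature. The column names are invented for illustration, loosely echoing the Lending Club example.

```python
# Impute missing values instead of silently dropping the rows, and keep the
# missingness itself as a feature. Column names are invented for illustration.
import pandas as pd
from sklearn.impute import SimpleImputer

df = pd.DataFrame({"income": [50_000, None, 82_000],
                   "months_since_last_delinq": [12.0, None, 30.0]})

# A missing delinquency value may mean "never delinquent": preserve that signal.
df["never_delinquent"] = df["months_since_last_delinq"].isna()

imputer = SimpleImputer(strategy="median")
df[["income", "months_since_last_delinq"]] = imputer.fit_transform(
    df[["income", "months_since_last_delinq"]])
print(df)
```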


An Experiment with Bands and Dimensions in Classifiers

Greer, Kieran

arXiv.org Artificial Intelligence

This paper presents a new version of an oscillating-error classifier that adds fixed value ranges, through bands, for each column or feature of the input dataset. An earlier version of the classifier added branches [8] to a categorical classification technique, allowing the error update to be independent for each column value and therefore to oscillate around the desired output, reducing to some minimum. Because that classifier works off averaged values, it may be the case that some data can be classified directly, without having to be sorted by weight sets, for example. The averaged value is a single value standing in for a whole range of actual input values, so a value band may be able to represent that range as a fixed set of boundaries. It may also be possible to construct these fixed boundaries for single dimensions, when much more complex hypercubes are not required. It is shown that some of the data can in fact be correctly classified using fixed value ranges only, while the rest can be classified by the classifiers. Combining these fixed bands, which do very little processing, with the more complex classifiers, the paper also presents the whole system in terms of a biological model of neurons and neuron links.
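
The band idea can be sketched as follows, under one illustrative reading of the description: record the [min, max] range of each feature per class, classify a row directly when it falls inside exactly one class's bands, and fall through to the full classifier otherwise.

```python
# Fixed value bands: the [min, max] range of each feature per class, learned
# from training data. A row inside exactly one class's bands is classified
# directly; anything ambiguous falls through to the full classifier.
import numpy as np

def fit_bands(X, y):
    return {c: (X[y == c].min(axis=0), X[y == c].max(axis=0))
            for c in np.unique(y)}

def band_classify(x, bands):
    hits = [c for c, (lo, hi) in bands.items()
            if np.all((lo <= x) & (x <= hi))]
    return hits[0] if len(hits) == 1 else None    # None: defer to the classifier
```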


A New Oscillating-Error Technique for Classifiers

Greer, Kieran

arXiv.org Artificial Intelligence

This paper describes a new method for reducing the error in a classifier. It uses an error-correction update that includes the very simple rule of either adding or subtracting the error adjustment, based on whether the variable value is currently larger or smaller than the desired value. While a traditional neuron would sum the inputs together and then apply a function to the total, this new method can change the function decision for each input value. This gives added flexibility to the convergence procedure, where, through a series of transpositions, variables that are far away can continue towards the desired value, whereas variables that are originally much closer can oscillate from one side to the other. Tests show that the method can successfully classify some benchmark datasets. It can also work in batch mode, with reduced training times, and can be used as part of a neural network architecture. Some comparisons with an earlier wave-shape paper are also made.
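
The add-or-subtract rule itself is tiny; here is a minimal sketch, with the fixed step size being our own assumption:

```python
# Per-value add-or-subtract update: move each value a fixed step towards the
# desired output, with the sign chosen by which side of the target it is on.
import numpy as np

def oscillating_update(values, desired, step=0.05):
    return values - step * np.sign(values - desired)

v = np.array([0.9, 0.2])
for _ in range(20):
    v = oscillating_update(v, desired=0.5)
print(v)   # both values have converged on, or oscillate tightly around, 0.5
```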


An Improved Oscillating-Error Classifier with Branching

Greer, Kieran

arXiv.org Machine Learning

This paper extends the earlier work, based on an oscillating error-correction technique [7]. The method uses an error-correction update that includes a very simple rule of either adding or subtracting the error adjustment, based on whether the variable value is currently larger or smaller than the desired value. This has relations with cellular automata [2], where the small add-or-subtract decision gives the classifier an added dimension of flexibility. The results reported in the first paper were unusually good over a wide range of datasets, and it was subsequently found that an error had been made in how the classifier decides on the correct output category. The earlier paper measured the error amount between the desired output value and the value produced by the corresponding classifier. If the error was small enough, then the classification was considered to be correct. When training the classifier, the data rows for each category would be put together and averaged. The classifier would then try to learn these average values, which would lead to distinct weight sets for each output category. It was overlooked that, even if the desired output category correctly classified the input data row, one of the other categories' weight sets could produce an even smaller error.
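
The corrected decision rule the paragraph implies, picking the category whose weight set yields the smallest error over all categories rather than only checking the desired one, can be sketched as follows, assuming one averaged vector per category:

```python
# Corrected decision rule: score the input against every category's averaged
# weight set and pick the smallest error, rather than only checking whether
# the desired category's error is small enough.
import numpy as np

def fit_category_averages(X, y):
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def classify(x, averages):
    errors = {c: np.abs(x - w).sum() for c, w in averages.items()}
    return min(errors, key=errors.get)            # smallest error wins
```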


Startup Unveils Machine Learning Products Based on Novel Approach to AI

#artificialintelligence

Gamalon Inc., which emerged from stealth mode this week, announced two machine learning products based on an in-house technology known as Bayesian Program Synthesis (BPS). The company claims BPS can perform machine learning tasks 100 times faster than conventional deep learning techniques, while providing more accurate results. "We call our way of doing this Bayesian program learning," said Gamalon founder and CEO Ben Vigoda at a recent TED talk. He believes using Bayesian probabilistic modeling is a much more efficient way, that is, a much less computationally intensive way, to infuse intelligence into machines. Unlike deep learning, which often needs millions of data examples to train a neural network, a Bayesian model can be built with far fewer examples.
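
Gamalon's BPS technology is proprietary, but the claim about needing fewer examples reflects a standard property of Bayesian inference: an informative prior lets a handful of observations already yield a usable posterior. A toy Beta-Binomial update illustrates this:

```python
# Toy Beta-Binomial update: with an informative prior, four observations
# already give a usable posterior estimate of a success rate.
a, b = 8, 2                      # Beta prior encoding a belief of roughly 0.8
successes, failures = 3, 1       # only four observed examples
a, b = a + successes, b + failures
print(a / (a + b))               # posterior mean ~= 0.786 from just 4 data points
```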